Biostatistics For Dummies, 2nd Edition (Monika Wahi, John Pezzullo)

CHAPTER 7 Having Designs on Study Design 93

But the problem with ecologic studies is that the experimental unit is a whole population, not an individual. What if the individuals in the United States who ate low-fat diets were actually the ones who died of CHD? And what if the ones who ate high-fat diets were more likely to die of something else? Attributing the behavior of a group to the individuals in it is called the ecologic fallacy, and it can be a problem when interpreting results like the ones shown in Figure 7-3.
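To see how the ecologic fallacy can arise, here is a minimal Python sketch. Every count in it is hypothetical and invented purely for illustration (the population names A, B, and C and the function names are assumptions, not data from Figure 7-3). Across the three populations, a larger share of high-fat eaters goes along with a higher CHD death rate, yet inside each population the low-fat eaters are the ones at higher risk.

```python
# Hypothetical counts per population, invented for illustration only:
# (high-fat & CHD death, high-fat & no CHD death,
#  low-fat & CHD death,  low-fat & no CHD death)
populations = {
    "A": (5, 195, 45, 755),
    "B": (20, 480, 80, 420),
    "C": (90, 710, 60, 140),
}


def group_summary(counts):
    """Return what an ecologic study sees: (share eating high-fat, CHD death rate)."""
    hf_d, hf_nd, lf_d, lf_nd = counts
    total = hf_d + hf_nd + lf_d + lf_nd
    return (hf_d + hf_nd) / total, (hf_d + lf_d) / total


def individual_risks(counts):
    """Return individual-level risks: (CHD risk if high-fat, CHD risk if low-fat)."""
    hf_d, hf_nd, lf_d, lf_nd = counts
    return hf_d / (hf_d + hf_nd), lf_d / (lf_d + lf_nd)


for name, counts in populations.items():
    share, rate = group_summary(counts)
    risk_hf, risk_lf = individual_risks(counts)
    print(
        f"{name}: high-fat share {share:.0%}, CHD death rate {rate:.1%}, "
        f"risk if high-fat {risk_hf:.1%}, risk if low-fat {risk_lf:.1%}"
    )
```

An ecologic study sees only the first two numbers on each line, which rise together from A to C. The last two numbers, which need individual-level data, tell the opposite story in this made-up example.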

That is why we also have cross-sectional studies, where the experimental unit is an individual, not a population. A cross-sectional study takes measurements of individuals at one point in time, either through an in-person, hands-on examination or by survey (over the phone, over the Internet, or in person). The National Health and Nutrition Examination Survey (NHANES) is a cross-sectional surveillance effort done by the U.S. government on a sample of residents every year. NHANES makes many measurements relevant to human health in the United States, including dietary fat intake as well as the status of many chronic diseases, including CHD. If an analysis of cross-sectional data like NHANES found a strong positive association between high dietary fat intake and a CHD diagnosis in the individuals participating, it would still be weak evidence for causation (because exposure and outcome are measured at the same time, you cannot tell which came first), but it would be stronger than what was found in the ecologic study presented in Figure 7-3.

Going from case series to case-control

The reason that there are two types of analytic study designs, case-control studies and cohort studies, is that cohort study designs do not work well for statistically rare conditions. We use the term statistically rare because if someone you love gets cancer, cancer does not seem very rare. Yet if you enroll thousands of cancer-free individuals, including your loved one, in a cohort study and measure this cohort yearly to see who is diagnosed with cancer, it would take many years to get enough outcomes to be able to develop the regression models (like the ones described in Chapters 16 through 23) that would be necessary for causal inference. So for statistically rare conditions like the various cancers, you use the case-control design.
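A little back-of-the-envelope arithmetic shows why a cohort study struggles with rare outcomes. The Python sketch below is only an illustration: the cohort size, the incidence figures, and the target of 100 cases are hypothetical numbers chosen to keep the arithmetic easy, and the function name is made up. It also assumes a constant incidence and ignores dropout and the fact that the at-risk group shrinks as people are diagnosed.

```python
import math


def years_to_reach_cases(cohort_size, cases_per_100k_per_year, target_cases):
    """Whole years of follow-up until the expected number of cases reaches the target.

    Simplifying assumptions: constant incidence, no dropout, and a cohort
    that stays the same size over time.
    """
    expected_cases_per_year = cohort_size * cases_per_100k_per_year / 100_000
    return math.ceil(target_cases / expected_cases_per_year)


# Hypothetical rare outcome: 50 new cases per 100,000 people per year.
rare = years_to_reach_cases(5_000, 50, 100)

# Hypothetical common outcome: 1,000 new cases per 100,000 people per year.
common = years_to_reach_cases(5_000, 1_000, 100)

print(f"Rare outcome: about {rare} years; common outcome: about {common} years")
```

With these made-up inputs, a cohort of 5,000 expects 2.5 cases a year of the rare outcome and needs about 40 years to reach 100 cases, compared with about 2 years for the common one. A case-control study sidesteps the wait by starting with people who already have the outcome.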

You can use a fourfold, or 2x2, table to better understand how case-control studies are different from cohort studies. (Refer to Chapters 13 and 14 for more about 2x2 tables.) As shown in Figure 7-4, the 2x2 table cells are labeled relative to exposure status (the rows) and outcome or disease status (the columns). For the columns, D+ stands for having the disease (or outcome), and D– means not having the disease or outcome. Also, for the rows, E+ means having the exposure, and E– means not having the exposure. Cell a includes the counts of individuals in the study who were positive for both the exposure and outcome, and cell d includes the counts of individuals who were negative for both the exposure and outcome (a and d are concordant cells because the exposure and outcome statuses agree). In the